Abstract

In the reordering buffer problem (RBP), a server is asked to process a sequence of requests lying in a 
metric space. To process a request the server must move to the corresponding point in the metric. The 
requests can be processed slightly out of order; in particular, the server has a buffer of capacity k which 
can store up to k requests as it reads in the sequence. The goal is to reorder the requests in such a manner 
that the buffer constraint is satisfied and the total travel cost of the server is minimized. The RBP arises 
in many applications that require scheduling with a limited buffer capacity, such as scheduling a disk 
arm in storage systems, switching colors in paint shops of a car manufacturing plant, and rendering 3D 
images in computer graphics. 

We study the offline version of RBP and develop bicriteria approximations. When the underlying 
metric is a tree, we obtain a solution of cost no more than 9 OPT using a buffer of capacity 4k + 1 
where OPT is the cost of an optimal solution with buffer capacity k. Constant factor approximations 
were known previously only for the uniform metric (Avigdor-Elgrabli et al., 2012). Via randomized tree 
embeddings, this implies an $O(\log n)$ approximation to cost and $O(1)$ approximation to buffer size for
general metrics. Previously the best known algorithm for arbitrary metrics by Englert et al. (2007)
provided an $O(\log^2 k \log n)$ approximation without violating the buffer constraint.

1 Introduction 

We consider the reordering buffer problem (RBP) where a server with buffer capacity k has to process a 
sequence of requests lying in a metric space. The server is initially stationed at a given vertex and at any 
point of time it can store at most k requests. In particular, if there are k requests in the buffer then the server 
must process one of them (that is, visit the corresponding vertex in the metric space) before reading in the 
next request from the input sequence. The objective is to process the requests in an order that minimizes 
the total distance travelled by the server. 

RBP provides a unified model for studying scheduling with limited buffer capacity. Such scheduling 
problems arise in numerous areas including storage systems, computer graphics, job shops, and information 
retrieval (see [HI [TBI 031 IS]). For example, in a secondary storage system the overall performance critically
depends on the response time of the underlying disk devices. Hence disk devices need to schedule their disk 
arm in a way that minimizes the mean seek time. Specifically, these devices receive read/write requests 
which are located on different cylinders and they must move the disk arm to the proper cylinder in order 
to serve a request. The device can buffer a limited number of requests and must deploy a scheduling policy 
to minimize the overall service time. Note that we can model this disk arm scheduling problem as an RBP
instance by representing the disk arm as a server and the array of cylinders as a metric space over read/write 
requests. 

The RBP can be seen to be NP-Hard via a reduction from the traveling salesperson problem. We study 
approximation algorithms. RBP has been considered in both online and offline contexts. In the online setting 
the entire input sequence is not known beforehand and the requests arrive one after the other. This setting 
was considered by Englert et al. [7], who developed an $O(\log^2 k \log n)$-competitive algorithm. To the best of
our knowledge this is the best known approximation guarantee for RBP over arbitrary metrics both in the 
online and offline case. 

RBP remains NP-Hard even when restricted to the uniform metric (see [B]). In fact the uniform metric is 
an interesting special case as it models scheduling of paint jobs in a car manufacturing plant. In particular, 
switching paint color is a costly operation; hence, paint shops temporarily store cars and process them out 
of order to minimize color switches. Over the uniform metric, RBP is somewhat related to paging. However, 
unlike for the latter, simple greedy strategies like First in First Out and Least Recently Used yield poor 
competitive ratios (see [15]). Even the offline version of the uniform metric case does not seem to admit
simple approximation algorithms. The best known approximation for this setting, due to Avigdor-Elgrabli
et al. [3], relies on intricate rounding of a linear programming relaxation in order to get a constant-factor
approximation. 

The hardness of the RBP appears to stem primarily from the strict buffer constraint; it is therefore 
natural to relax this constraint and consider bicriteria approximations. We say that an algorithm achieves 
an $(\alpha, \beta)$ bicriteria approximation if, given any RBP instance, it generates a solution of cost no more than
$\alpha \cdot \mathrm{OPT}$ using a buffer of capacity $\beta k$. Here OPT is the cost of an optimal solution with buffer capacity $k$.
There are few bicriteria results known for the RBP. For the offline version of the uniform metric case, a
bicriteria approximation of $(O(1/\epsilon), 2 + \epsilon)$ for every $\epsilon > 0$ was given by Chan et al. [§]. For the online version
of this restricted case, Englert et al. [8] developed a (4, 4)-competitive algorithm. They further showed how 
to convert this bicriteria approximation into a true approximation with a logarithmic ratio. We show in 
Appendix A that such a conversion from a bicriteria approximation to a true approximation is not possible
at small loss in more general metrics, e.g. the evenly-spaced line metric. In more general metrics, relaxing 
the buffer constraint therefore gives us significant extra power in approximation. 

We study bicriteria approximation for the offline version of RBP. When the underlying metric is a 
weighted tree we obtain a $(9, 4 + 1/k)$ bicriteria approximation algorithm. Using tree embeddings of [9] this
implies an $(O(\log n), 4 + 1/k)$ bicriteria approximation for arbitrary metrics over $n$ points.

Other Related Work: Besides the work of Englert et al. [7], existing results address RBP over very
specific metrics. RBP was first considered by Räcke et al. [15]. They focused on the uniform metric with
online arrival of requests and developed an $O(\log^2 k)$-competitive algorithm. This was subsequently improved
on by a number of results leading to an $O(\sqrt{\log k})$-competitive algorithm [1].

With the disk arm scheduling problem in mind, Khandekar et al. [12] considered the online version of 
RBP over the evenly-spaced line metric (line graph with unit edge lengths) and gave an online algorithm 
with a competitive ratio of $O(\log^2 n)$. This was improved on by Gamzu et al. [10] to an $O(\log n)$-competitive
algorithm. 

Bicriteria approximations have been studied previously in the context of resource augmentation (see [14]
and references therein). In this paradigm, the algorithm is augmented with extra resources (usually faster
processors) and the benchmark is an optimal solution without augmentation. This approach has been applied
to, for example, paging [17], scheduling PUH], and routing problems [16].

Techniques: We can assume without loss of generality that the server is lazy and services each request 
when it absolutely must — to create space in the buffer for a newly received request. Then after reading in 
the first k requests, the server must serve exactly one request for each new one received. Intuitively, adding 
extra space in the buffer lets us defer serving decisions. In particular, while the optimal server must serve 
a request at every step, we serve requests in batches at regular intervals. Partitioning requests into batches 
appears to be more tractable than determining the exact order in which requests appear in an optimal 
solution. This enables us to go beyond previous approaches (see [2, 3]) that try to extract the order in which
requests appear in an optimal solution. We enforce the buffer capacity constraint by placing lower bounds
on the cardinalities of the batches. In particular, by ensuring that each batch is large enough, we make sure 
that the server "carries forward" few requests. Then the maximum buffer utilization can be bounded by the 
number of requests carried forward plus the number read before the next batch is processed. 
A crucial observation that underlies our algorithm is that when the underlying metric is a tree, we can
find vertices $\{v_i\}_i$ that any solution with buffer capacity $k$ must visit in order. This allows us to anchor the
$i$th batch at $v_i$ and equate the serving cost of a batch to the cost of the subtree spanning the batch and
rooted at $v_i$. Overall, when the underlying metric is a tree, the problem of finding low-cost batches with
cardinality constraints reduces to finding low-cost subtrees which are rooted at the $v_i$'s, cover all the requests,
and satisfy the same cardinality constraints. We formulate a linear programming relaxation, LP1, for this
covering problem. 

Rounding LP1 directly is difficult because of the cardinality constraints. To handle this we round LP1
partially to formulate another relaxation that is free of the cardinality constraints and is amenable to rounding.
Specifically, using a fractional optimal solution to LP1, we determine for each request $j$ an interval of
indices, $\Gamma(j)$, such that any solution that assigns every request to a batch within its corresponding interval
approximately satisfies the buffer constraint. This allows us to remove the cardinality constraints and instead
formulate an interval-assignment relaxation LP2. In order to get the desired bicriteria approximation
we show two things: first, the optimal cost achieved by LP2 is within a constant factor of the optimal cost
for the given RBP instance; second, an integral feasible solution of LP2 can be transformed into an RBP
solution using a bounded amount of extra buffer space. Finally we develop a rounding algorithm for LP2
which achieves an approximation ratio of 2.

2 Notation 

An instance of RBP is specified by a metric space over a vertex set $V$, a sequence of $n$ vertices (requests),
an integer $k$, and a starting vertex $v_0$. The metric space is represented by a graph $G = (V, E)$ with distance
function $d : E \to \mathbb{R}^+$ on edges. We index requests by $j$. We assume without loss of generality that requests
are distinct vertices. Starting at $v_0$, the server reads requests from the input sequence into its buffer and
clears requests from its buffer by visiting them in the graph (we say these requests are served). The goal 
is to serve all requests, having at most k buffered requests at any point in time, with minimum traveling 
distance. We denote the optimal solution as OPT. For the most part of this paper, we focus on the special 
case where G is a tree. 

We break up the timeline into windows as follows. Without loss of generality, $n$ is a multiple of $2k + 1$,
i.e. $n = (2k + 1)m$. For $i \in [m]$, we define window $W_i$ to be the set of requests from $(2k + 1)(i - 1) + 1$ to
$(2k + 1)i$. Let $w(j)$ be the index of the window to which $j$ belongs. The $i$-th time window is defined to be
the duration in which the server reads $W_i$.
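
As a small illustration of the window decomposition, the sketch below splits a request sequence into consecutive windows of $2k + 1$ requests and computes the window index $w(j)$. The function names `split_into_windows` and `window_index` are hypothetical helpers introduced here, and requests are assumed to be given as a Python list with 1-based positions $j$.

```python
def split_into_windows(requests, k):
    """Partition a request sequence into consecutive windows of 2k + 1 requests."""
    size = 2 * k + 1
    if len(requests) % size != 0:
        # The text assumes n is a multiple of 2k + 1; padding is not handled here.
        raise ValueError("sequence length must be a multiple of 2k + 1")
    return [requests[s:s + size] for s in range(0, len(requests), size)]


def window_index(j, k):
    """1-based window index w(j) of the request at 1-based position j."""
    return (j - 1) // (2 * k + 1) + 1
```

For example, with $k = 2$ a sequence of ten requests is split into $m = 2$ windows of five requests each.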

3 Reduction to Request Cover Problem 

In this section we show how to use extra buffer space to convert the RBP into a new and simpler problem 
that we call Request Cover. The key tool for the reduction is the following lemma which states that we can 
find for every window a vertex in the graph G that must be visited by any feasible solution within the same 
window. We call these vertices terminals. This allows us to break up the server's path into segments that 
start and end at terminals. 

Lemma 1. For each $i$, there exists a vertex $v_i$ such that all feasible solutions with buffer capacity $k$ must
visit $v_i$ in the $i$-th time window.

Proof. Fix a feasible solution and $i$. We orient the tree as follows. For each edge $e = (u, v)$, if after removing
$e$ from the tree, the component containing $u$ contains at most $k$ requests of $W_i$, then we direct the edge from
$u$ to $v$. Since $|W_i| = 2k + 1$, there is exactly one directed copy of each edge.

An oriented tree is acyclic so there exists a vertex $v_i$ with incoming edges only. We claim that the server
must visit $v_i$ during the $i$-th time window. During the $i$-th time window, the server reads all $2k + 1$ requests
of $W_i$. Since each component of the induced subgraph $G[V \setminus \{v_i\}]$ contains at most $k$ requests of $W_i$ and the
server has a buffer of size $k$, it cannot remain in a single component for the entire time window. Therefore,
the server must visit at least two components, passing by $v_i$ at some point during the $i$-th time window. □
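
The orientation argument in the proof suggests a direct way to locate a terminal. The following is a minimal sketch, not the authors' implementation: it roots the tree at an arbitrary request, counts the window's requests in every subtree, and walks toward any child subtree holding more than $k$ requests. The function name `find_terminal` and the edge-list input format are assumptions made for illustration.

```python
from collections import defaultdict


def find_terminal(edges, window, k):
    """Return a vertex v such that every component of the tree minus v
    holds at most k of the 2k + 1 requests in `window`."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    count = defaultdict(int)
    for r in window:
        count[r] += 1
    root = window[0]
    # Iterative DFS: record parent pointers and a top-down visiting order.
    parent, order, stack = {root: None}, [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for nb in adj[u]:
            if nb != parent[u]:
                parent[nb] = u
                stack.append(nb)
    # Accumulate per-subtree request counts bottom-up.
    sub = dict.fromkeys(order, 0)
    for u in reversed(order):
        sub[u] += count[u]
        if parent[u] is not None:
            sub[parent[u]] += sub[u]
    # Descend while some child subtree holds more than k requests.
    v = root
    while True:
        heavy = [c for c in adj[v] if c != parent[v] and sub[c] > k]
        if not heavy:
            return v
        v = heavy[0]
```

When the walk descends into a child subtree with at least $k + 1$ requests, the part of the tree left behind holds at most $k$ of the $2k + 1$ requests, so the vertex where the walk stops has the property required by the lemma.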
For the remainder of the argument, we will fix the terminals $v_1, \ldots, v_m$. Note that since $G$ is a tree, there
is a unique path visiting the terminals in sequence, and every solution must contain this path. For each $i$,
let $P_i$ denote the path from $v_{i-1}$ to $v_i$.

We can now formally define request covers. 

Definition 1 (Request cover). Let $\mathcal{B}$ be a partition of the requests into batches $B_1, \ldots, B_m$, and $\mathcal{E}$ be an
ordered collection of $m$ edge subsets $E_1, \ldots, E_m \subseteq E$. The pair $(\mathcal{B}, \mathcal{E})$ is a request cover if

1. For every request $j$, the index of the batch containing $j$ is at least $w(j)$, i.e. the window in which $j$ is
released.

2. For all $i \in [m]$, $E_i \cup P_i$ is a connected subgraph spanning $B_i$.

3. There exists a constant $\beta$ such that for all $i \in [m]$, we have $\sum_{l \le i} |B_l| \ge (2k + 1)i - \beta k$; we say that
the request cover is $\beta$-feasible. We call the request cover feasible if $\beta = 1$.

The length of a request cover is $d(\mathcal{E}) = \sum_i d(E_i)$.

Definition 2 (Request Cover Problem (RCP)). In the RCP we are given a metric space $G = (V, E)$ with
lengths $d(e)$ on edges, a sequence of $n$ requests, buffer capacity constraint $k$, and a sequence of $m = n/(2k + 1)$
terminals $v_1, \ldots, v_m$. Our goal is to find a feasible request cover of minimum length.

We will now relate the request cover problem to the RBP. Let $(\mathcal{B}^*, \mathcal{E}^*)$ denote the optimal solution to
the RCP. We show on the one hand (Lemma 2) that this solution has cost within a constant factor of OPT,
the optimal solution to RBP. On the other hand, we show (Lemma 3) that any $\beta$-feasible solution to RCP
can be converted into a solution to the RBP that is feasible for a buffer of size $(2 + \beta)k + 1$ with a constant
factor loss in length.

Lemma 2. $d(\mathrm{OPT}) \ge d(\mathcal{E}^*)$.

Proof. For each $i$, let $E_i$ be the edges traversed by the optimal server during the $i$-th time window, let $B_i$ be
the set of requests it serves during that window, and let $\mathcal{E}$ be the collection of edge subsets. We have
$d(\mathrm{OPT}) \ge \sum_i d(E_i) = d(\mathcal{E})$, so it suffices to show that
$\mathcal{E} = (E_1, \ldots, E_m)$ is a feasible request cover. By Lemma 1 both $E_i$ and $P_i$ are connected subgraphs
containing $v_i$ for each $i$. Hence $E_i \cup P_i$ is connected. Since $E_l$ contains the requests served in the $l$-th time window
for each $l$, and for each $i$ the server has read $(2k + 1)i$ requests and served all except at most $k$ of them by
the end of the $i$-th time window, we get that $\sum_{l \le i} |B_l| \ge (2k + 1)i - k$. This proves that $\mathcal{E}$ is a feasible
request cover. □

Next, consider a request cover $(\mathcal{B}, \mathcal{E})$. We may assume without loss of generality that for all $i$, $E_i \cap P_i = \emptyset$.
This observation implies that $E_i$ can be partitioned into components $E_i(p)$ for each vertex $p \in P_i$, where
$E_i(p)$ is the component of $E_i$ containing $p$.

We will now define a server for the RBP, Batch-Server($\mathcal{B}, \mathcal{E}$), based on the solution $(\mathcal{B}, \mathcal{E})$. Recall that
the server has to start at $v_0$. In the $i$-th iteration, it first buffers all requests in window $W_i$. Then it moves
from $v_{i-1}$ to $v_i$ and serves requests of $B_i$ as it passes by them.



Algorithm 1 Batch-Server($\mathcal{B}, \mathcal{E}$)
Start at $v_0$
for $i = 1$ to $m$ do
    (Buffering phase) Read $W_i$ into buffer
    (Serving phase) Move from $v_{i-1}$ to $v_i$ along $P_i$, and for each vertex $p \in P_i$, perform an Eulerian tour
        of $E_i(p)$. Serve requests of $B_i$ along the way.
end for



Lemma 3. Given a $\beta$-feasible request cover $(\mathcal{B}, \mathcal{E})$, Batch-Server($\mathcal{B}, \mathcal{E}$) is a feasible solution to the RBP
instance with a buffer of size $(2 + \beta)k + 1$, and has length at most $d(\mathrm{OPT}) + 2d(\mathcal{E})$.
Proof. We analyze the length first. In iteration $i$, the server uses each edge of $P_i$ exactly once. Since $E_i$ is
a disjoint union of $E_i(p)$ for $p \in P_i$, the server uses each edge of $E_i$ twice during the Eulerian tours of $E_i$'s
components. The total length is therefore

$$ \sum_i d(P_i) + \sum_i 2d(E_i) \le d(\mathrm{OPT}) + 2d(\mathcal{E}). $$

Next, we show that the server has at most $(2 + \beta)k + 1$ requests in its buffer at any point in time. We
claim that all of $B_i$ is served by the end of the $i$-th iteration. Consider a request $j$ that belongs to a batch
$B_i$. Since $i$ is at least as large as $w(j)$, the request has already been received by the $i$th phase. The server
visits $j$'s location during the $i$th iteration and therefore services the request at that time if not earlier. This
proves the claim.

The claim implies that the server begins the $(i + 1)$-th iteration having read $(2k + 1)i$ requests and served
$\sum_{l \le i} |B_l| \ge (2k + 1)i - \beta k$ requests, that is, with at most $\beta k$ requests in its buffer. It adds $2k + 1$ requests
to the buffer during this iteration. So it uses at most $(2 + \beta)k + 1$ buffer space at all times. □
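
The buffer accounting in this proof can be checked numerically. The sketch below is an illustration under simplifying assumptions, not part of the paper's algorithm: it ignores the metric entirely and only tracks how many requests sit in the buffer when each batch is served at the end of its iteration. The name `max_buffer_usage` and the list encoding of the batch assignment are hypothetical.

```python
def max_buffer_usage(batch_of, k):
    """Peak buffer occupancy of a batch server.

    batch_of[j] is the 1-based batch index of the request at 0-based
    position j; windows consist of 2k + 1 consecutive requests.
    """
    size = 2 * k + 1
    m = len(batch_of) // size
    buffered, peak = 0, 0
    for i in range(1, m + 1):
        window = batch_of[(i - 1) * size:i * size]
        if any(b < i for b in window):
            raise ValueError("request assigned to a batch before its window")
        buffered += size  # buffering phase: read all of W_i
        peak = max(peak, buffered)
        # Serving phase: every request of batch B_i has been read by now.
        buffered -= sum(1 for b in batch_of[:i * size] if b == i)
    return peak
```

With $k = 1$ and the assignment `[1, 1, 2, 2, 2, 2]`, the cover is 1-feasible and the peak occupancy is 4, which matches the bound $(2 + \beta)k + 1$ for $\beta = 1$.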



4 Approximating the Request Cover Problem 



We will now show how to approximate the request cover problem. Our approach is to start with an LP 
relaxation of the problem, and use the optimal fractional solution to the LP to further define a simpler 
covering problem which we then approximate in Section 4.2.



4.1 The request cover LP and the interval cover problem 

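The integer linear program formulation of RCP is as follows. To obtain an LP relaxation we relax the last
two constraints to $x(j,i), y(e,i) \in [0, 1]$.

$$
\begin{aligned}
\text{minimize} \quad & \sum_i \sum_e y(e,i)\, d_e \\
\text{subject to} \quad & \sum_{i \,:\, w(j) \le i} x(j,i) = 1 && \forall j \\
& \sum_{j \,:\, w(j) \le i} \ \sum_{i' \le i} x(j,i') \ge (2k+1)i - k && \forall i \\
& y(e,i) \ge x(j,i) && \forall i, j,\ e \in R_{ji} \\
& x(j,i) \in \{0,1\} && \forall i, j \\
& y(e,i) \in \{0,1\} && \forall i, e
\end{aligned}
\tag{LP1}
$$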



Here the variable $x(j, i)$ indicates whether request $j$ is assigned to batch $B_i$ and the variable $y(e, i)$ indicates
whether edge $e$ is in $E_i$. Recall that the edge set $E_i$ along with path $P_i$ should span $B_i$. Let $R_{ji}$ denote the
(unique) path in $G$ from $j$ to $P_i$. The third inequality above captures the constraint that if $j$ is assigned to
$B_i$ and $e \in R_{ji}$, then $e$ must belong to $E_i$.

Let (x*,y*) be the fractional optimal solution to the linear relaxation of (LP1). Instead of rounding 
(x*,y*) directly to get a feasible request cover, we will show that it is sufficient to find request covers that 
"mimic" the fractional assignment x* but do not necessarily satisfy the cardinality constraints on the batches 
(i.e. the second set of inequalities in the LP). To this end we define an interval request cover below. 

Definition 3 (Interval request cover). For each request $j$, we define the service deadline
$h(j) = \min\{i \ge w(j) : \sum_{i' \le i} x^*(j, i') \ge 1/2\}$ and the service interval $\Gamma(j) = [w(j), h(j)]$. A request cover $(\mathcal{B}, \mathcal{E})$ is an interval
request cover if it assigns every request to a batch within its service interval.
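
To make the definition concrete, the sketch below computes the service interval of a single request from its row of the fractional solution. It assumes the row is given as a dictionary mapping window indices $i$ to the values $x^*(j, i)$; the function name `service_interval` is a hypothetical helper, and no LP solver is involved.

```python
def service_interval(w_j, x_star_j):
    """Return (w(j), h(j)), where x_star_j[i] holds x*(j, i) for window index i."""
    total = 0.0
    for i in sorted(x_star_j):
        if i < w_j:
            continue  # a request cannot be assigned before its own window
        total += x_star_j[i]
        if total >= 0.5:
            return (w_j, i)  # first index at which the prefix mass reaches 1/2
    raise ValueError("fractional assignment sums to less than 1/2")
```

For instance, a request with $w(j) = 2$ and fractional values $0.25, 0.25, 0.5$ on windows $2, 3, 4$ has service interval $[2, 3]$.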
In other words, while x* "half-assigns" each request no later than its service deadline, an interval request
cover mimics x* by integrally assigning each request no later than its service deadline. The following is a 
linear programming formulation for the problem of finding minimum length interval request covers. 



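$$
\begin{aligned}
\text{minimize} \quad & \sum_i \sum_e y(e,i)\, d_e \\
\text{subject to} \quad & \sum_{i \in \Gamma(j)} x(j,i) \ge 1 && \forall j \\
& y(e,i) \ge x(j,i) && \forall j,\ i \in \Gamma(j),\ e \in R_{ji} \\
& x(j,i),\, y(e,i) \in [0,1] && \forall i, j, e
\end{aligned}
\tag{LP2}
$$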



Let $(x, y)$ be the fractional optimal solution of (LP2). We now show that interval request covers are 2-feasible
request covers and that $d(y) \le 2d(y^*)$. Since $d(y^*) \le d(\mathcal{E}^*)$, it would then suffice to round (LP2).



Lemma 4. Interval request covers are 2-feasible. 

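Proof. Fix $i$. Let $H_i := \{j : h(j) \le i\}$ denote the set of all requests whose service intervals end at or before
the $i$th time window. We first claim that $|H_i| \ge (2k + 1)i - 2k$. In particular, the second constraint of (LP1)
and the definition of $H_i$ give us

$$
\begin{aligned}
(2k+1)i - k &\le \sum_{j \,:\, w(j) \le i} \ \sum_{i' \le i} x^*(j,i') \\
&= \sum_{j \in H_i \,:\, w(j) \le i} \ \sum_{i' \le i} x^*(j,i') + \sum_{j \notin H_i \,:\, w(j) \le i} \ \sum_{i' \le i} x^*(j,i') \\
&\le \sum_{j \in H_i \,:\, w(j) \le i} 1 + \sum_{j \notin H_i \,:\, w(j) \le i} \frac{1}{2} \\
&= |H_i| + \frac{1}{2}\bigl((2k+1)i - |H_i|\bigr).
\end{aligned}
$$

The claim now follows from rearranging the above inequality.

Note that in an interval request cover, each request in $H_i$ is assigned to some batch $B_l$ with $l \le i$.
Therefore,

$$ \sum_{l \le i} |B_l| \ge |H_i| \ge (2k + 1)i - 2k. $$

□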



We observe that multiplying all the coordinates of $x^*$ and $y^*$ by 2 gives us a feasible solution to (LP2).
Thus we have the following lemma.

Lemma 5. We have $d(y) \le 2d(y^*)$.



Note that the lemma says nothing about the integral optimal of (LP2) so a solution that merely approximates
the optimal integral interval request cover may not give a good approximation to the RBP, and
we need to bound the integrality gap of the LP. In the following subsection, we show that we can find an
interval request cover of length at most $2d(y)$.



4.2 Approximating the Interval Assignment LP 

Before we describe the general approximation, we consider two special cases for insight.



Example: single edge. Suppose the tree consists of a single unit-length edge $e = (u, v)$, all requests
reside at $u$, and all terminals at $v$. In this case, $R_{ji} = \{e\}$ for all pairs $j$ and $i$ so the second set of constraints
in (LP2) is simply

$$ y(i) \ge x(j,i) \qquad \forall j,\ i \in \Gamma(j), $$

where we write $y(i)$ for $y(e,i)$. A minimum solution satisfies these constraints with equality. Summing over
$i \in \Gamma(j)$, we get that in this case (LP2) is equivalent to

$$
\begin{aligned}
\text{minimize} \quad & \sum_i y(i) \\
\text{subject to} \quad & \sum_{i \in \Gamma(j)} y(i) \ge 1 && \forall j.
\end{aligned}
$$

This is exactly the linear relaxation for the hitting set$^1$ problem where the sets we want to hit are
intervals. While the general hitting set problem is hard, it turns out that this special case can be solved
exactly in polynomial time and the relaxation has no integrality gap.$^2$ Thus, we get an optimal solution via
a reduction to the minimum interval hitting set problem: compute a minimum hitting set $M$ for the set of
intervals $\mathcal{I} := \{\Gamma(j)\}$, and then add $e$ to $E_i$ for $i \in M$.
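
The minimum interval hitting set used in this special case can be computed by the standard right-endpoint greedy. The sketch below is one common way to do it and is not taken from the paper; it assumes intervals are closed integer ranges given as `(lo, hi)` pairs, and `min_interval_hitting_set` is a hypothetical name.

```python
def min_interval_hitting_set(intervals):
    """Smallest set of integers that intersects every closed interval [lo, hi]."""
    chosen = []
    last = None
    for lo, hi in sorted(intervals, key=lambda iv: iv[1]):
        if last is None or lo > last:
            # This interval is not hit yet; its right endpoint is the point
            # that can also hit the most intervals that end later.
            last = hi
            chosen.append(hi)
    return chosen
```

The intervals that trigger a new point are pairwise disjoint, so any hitting set needs at least as many points as the greedy selects.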



Example: two edges. Suppose the tree is a line graph consisting of three vertices $u_1$, $u_2$ and $v$ with unit-length
edges $e_1 = (u_1, v)$ and $e_2 = (u_2, u_1)$ (Figure 1(a)). Requests reside at $u_1$ and $u_2$, and all terminals at
$v$. For each $i$ and $j$ residing at $u_1$, we have $R_{ji} = \{e_1\}$. For each $i$ and $j$ residing at $u_2$ we have $R_{ji} = \{e_1, e_2\}$.

Thus feasible solutions to (LP2) satisfy the constraints

$$
\begin{aligned}
\sum_{i \in \Gamma(j)} y(e_1, i) &\ge 1 && \forall j, \\
\sum_{i \in \Gamma(j)} y(e_2, i) &\ge 1 && \forall j \text{ residing at } u_2.
\end{aligned}
$$

The constraints suggest that the vector $y(e_1, \cdot)$ is a fractional hitting set for the collection of intervals
$\mathcal{I}(e_1) := \{\Gamma(j)\}$, and $y(e_2, \cdot)$ for $\mathcal{I}(e_2) := \{\Gamma(j) : j \text{ resides at } u_2\}$. In light of the single-edge special case, a naive
approach is to first compute minimum hitting sets $M(e_1)$ and $M(e_2)$ for $\mathcal{I}(e_1)$ and $\mathcal{I}(e_2)$, respectively. Then
we add $e_1$ to $E_i$ for $i \in M(e_1)$, and $e_2$ to $E_i$ for $i \in M(e_2)$. However, the resulting edge sets may not be
connected. Instead, we make use of the following crucial facts:

(1) We should include $e_2$ in $E_i$ only if $e_1 \in E_i$, and,

(2) Minimal hitting sets are at most twice minimum fractional hitting sets (see Lemma 9).

These facts suggest that we should first compute a minimal hitting set $M(e_1)$ for $\mathcal{I}(e_1)$ and then compute
a minimal hitting set $M(e_2)$ for $\mathcal{I}(e_2)$ with the constraint that $M(e_2) \subseteq M(e_1)$. This is a valid solution to
(LP2) since $\mathcal{I}(e_2) \subseteq \mathcal{I}(e_1)$. We proceed as usual to compute $\mathcal{E}$. The resulting $\mathcal{E}$ is connected by (1) and
$d(\mathcal{E}) \le 2d(y)$ by (2).



General case. Motivated by the two-edge example, at a high level, our approach for the general case is 
as follows: 

1. We construct interval hitting set instances over each edge. 

2. We solve these instances starting from the edges nearest to the paths Pi first. 

3. We iteratively "extend" solutions for the instances nearer the paths to get minimal hitting sets for the 
instances further from the paths. 

$^1$ A subset $X$ of a universe $U$ is a hitting set for $\mathcal{S} \subseteq 2^U$ if $X \cap S \ne \emptyset$ for all $S \in \mathcal{S}$.
$^2$ One way to see this is that the columns of the constraint matrix have consecutive ones, and thus the constraint matrix is
totally unimodular.
Figure 1: (a) The two-edge example; (b) Ri is the path from ji to Pj, for i ∈ {1, 2, 3}. Note that arc (i>i, «a)
precedes (u, and no arc precedes (wi,Wa).



We then use Lemma 9 to argue a 2-approximation on an edge-by-edge basis.

Figure 1(b) gives an example of an instance of interval request cover. Note that whether an edge is closer
to some $P_i$ along a path $R_{ji}$ for some $j$ depends on which direction we are considering the edge in. We
therefore modify (LP2) to include directionality of edges, replacing each edge $e$ with bidirected arcs and
directing the paths $R_{ji}$ from $j$ to $P_i$.



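$$
\begin{aligned}
\text{minimize} \quad & \sum_i \sum_a y(a,i)\, d_a \\
\text{subject to} \quad & \sum_{i \in \Gamma(j)} x(j,i) \ge 1 && \forall j \\
& y(a,i) \ge x(j,i) && \forall j,\ i \in \Gamma(j),\ a \in R_{ji} \\
& x(j,i),\, y(a,i) \in [0,1] && \forall i, j, a
\end{aligned}
\tag{LP2'}
$$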



For every edge $e$ and window $i$, there is a single orientation of edge $e$ that belongs to $R_{ji}$ for some $j$. So
there is a 1-1 correspondence between the variables $y(e,i)$ in (LP2) and the variables $y(a,i)$ in (LP2'), and
the two LPs are equivalent. Henceforth we focus on (LP2').



Before presenting our final approximation we need some more notation. 

Definition 4. For each request $j$, we define $R_j$ to be the directed path from $j$ to $\bigcup_{i \in \Gamma(j)} P_i$. For each arc $a$,
we define $C(a) = \{j : a \in R_j\}$ and the set of intervals $\mathcal{I}(a) = \{\Gamma(j) : j \in C(a)\}$. We say that $a$ is a cut arc
if $C(a) \ne \emptyset$.

We say that an arc $a$ precedes arc $a'$, written $a \prec a'$, if there exists a directed path in the tree containing
both the arcs and $a$ appears after $a'$ in the path.

Lemma 6. Feasible solutions $(x, y)$ of (LP2') satisfy the following set of constraints for all arcs $a$:

$$ \sum_{i \in \Gamma(j)} y(a,i) \ge 1 \qquad \forall j \in C(a). $$

Proof. Let $(x, y)$ be a feasible solution of (LP2'). Fix an arc $a$ and $j \in C(a)$. For each $i \in \Gamma(j)$, we
have $a \in R_{ji}$ since $R_j$ is a path from $j$ to a connected subgraph containing $P_i$. By feasibility, we have
$y(a,i) \ge x(j,i)$. Summing over $\Gamma(j)$, we get $\sum_{i \in \Gamma(j)} y(a,i) \ge \sum_{i \in \Gamma(j)} x(j,i) \ge 1$ where the last inequality
follows from feasibility. □



We are now ready to describe the algorithm. At a high level, Algorithm 2 does the following: initially, it
finds a cut arc $a$ with no cut arc preceding it and computes a minimal hitting set $M(a)$ for $\mathcal{I}(a)$; iteratively,
it finds a cut arc $a$ whose preceding cut arcs have been processed previously, and minimally "extends" the
hitting sets $M(a')$ computed previously for the preceding arcs $a'$ to form a minimal hitting set $M(a)$.



Algorithm 2 Greedy extension
$U \leftarrow \{a : C(a) \ne \emptyset\}$
$A_i \leftarrow \emptyset$ for all $i$
$M(a) \leftarrow \emptyset$ for all arcs $a$
while $U \ne \emptyset$ do
    Let $a$ be any arc in $U$
    while there exists $a' \prec a$ in $U$ do
        $a \leftarrow a'$
    end while
    Let $a = (u, v)$
    $F(a) \leftarrow \{i : v \in P_i\} \cup \bigcup_{w : (v,w) \prec a} M((v,w))$
    Set $M(a) \subseteq F(a)$ to be a minimal hitting set for the intervals $\mathcal{I}(a)$
    $A_i \leftarrow A_i \cup \{a\}$ for all $i \in M(a)$
    $U \leftarrow U \setminus \{a\}$
end while
$f(j) \leftarrow \min\{i \in \Gamma(j) : j \text{ incident to } A_i \text{ or } P_i\}$ for all $j$
$B_i \leftarrow \{j : f(j) = i\}$ for all $i$
return $\mathcal{A} = (A_1, \ldots, A_m)$, $\mathcal{B} = (B_1, \ldots, B_m)$
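
The only nontrivial primitive in the greedy extension is shrinking a candidate index set $F(a)$ to an inclusion-minimal hitting set $M(a) \subseteq F(a)$. A minimal sketch of that single step follows; it assumes closed integer intervals given as `(lo, hi)` pairs and uses the hypothetical name `minimal_hitting_set`. It is one simple way to obtain minimality, and the paper does not prescribe a particular pruning order.

```python
def minimal_hitting_set(candidates, intervals):
    """Prune `candidates` to an inclusion-minimal hitting set of the closed intervals."""
    def hits_all(points):
        return all(any(lo <= p <= hi for p in points) for lo, hi in intervals)

    if not hits_all(candidates):
        raise ValueError("candidates do not hit every interval")
    kept = set(candidates)
    for p in sorted(candidates):
        if hits_all(kept - {p}):  # p is redundant given the other kept points
            kept.remove(p)
    return sorted(kept)
```

A single pass suffices: a point that was necessary when it was examined stays necessary afterwards, because the kept set only shrinks.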



We prove that Algorithm 2 actually manages to process all cut arcs $a$ and that $F(a)$ is a hitting set for
$\mathcal{I}(a)$. First, we make the following observation.

Lemma 7. For each iteration, the following holds.

1. If $U \ne \emptyset$, the inner 'while' loop finds an arc.

2. $F(a)$ is a hitting set for the intervals $\mathcal{I}(a)$.

Proof. Since we have a bidirected tree and an arc does not precede its reverse arc, the inner 'while' loop
does not repeat arcs and hence it stops with some arc. This proves the first statement.

We prove the second statement by induction on the algorithm's iterations. In the first iteration, the
set $U$ consists of all cut arcs so $a' \not\prec a$ for all cut arcs $a'$. Therefore, for all $\Gamma(j) \in \mathcal{I}(a)$, $a$ is the arc on $R_j$
closest to $\bigcup_{i \in \Gamma(j)} P_i$ and $v \in \bigcup_{i \in \Gamma(j)} P_i$. This proves the base case. Now we prove the inductive case. Fix
an interval $\Gamma(j) \in \mathcal{I}(a)$. If $a$ is the arc on $R_j$ closest to $\bigcup_{i \in \Gamma(j)} P_i$ and $v \in \bigcup_{i \in \Gamma(j)} P_i$, then $F(a) \cap \Gamma(j) \ne \emptyset$.
If not, then there exists a neighboring arc $(v, w) \in R_j$ closer to $\bigcup_{i \in \Gamma(j)} P_i$. We have that $\Gamma(j) \in \mathcal{I}((v, w))$
and $(v, w) \prec a$. Since the algorithm has processed all cut arcs preceding $a$, by the inductive hypothesis we
have $F((v, w)) \cap \Gamma(j) \ne \emptyset$. This implies that $M((v, w))$ is a hitting set for $\mathcal{I}((v, w))$ and so $F(a) \cap \Gamma(j) \ne \emptyset$.
Hence, $F(a)$ is a hitting set for $\mathcal{I}(a)$. □

Let $E_i$ be the set of edges whose corresponding arcs are in $A_i$ and $\mathcal{E} = (E_1, \ldots, E_m)$, i.e. the undirected
version of $\mathcal{A}$.

Lemma 8. $(\mathcal{B}, \mathcal{E})$ is an interval request cover.

Proof. The connectivity of $E_i \cup P_i$ follows from the fact that the algorithm starts with $A_i = \emptyset$, and in each
iteration an arc $a = (u, v)$ is added to $A_i$ only if $v \in P_i$ or $v$ is incident to some edge previously added to $A_i$.

Now it remains to show that $f(j) \in \Gamma(j)$ for all requests $j$, i.e. that there exists $i \in \Gamma(j)$ such that $j$ is
incident to $A_i$ or $P_i$. If $R_j = \emptyset$, then $j \in \bigcup_{i \in \Gamma(j)} P_i$. On the other hand if $R_j \ne \emptyset$, then let $a \in R_j$ be the
arc incident to $j$. Since the algorithm processes all cut arcs, we have $a \in \bigcup_{i \in \Gamma(j)} A_i$ and thus $j$ is incident
to $\bigcup_{i \in \Gamma(j)} A_i$. In both cases, we have $f(j) \in \Gamma(j)$. □

Next, we analyze the cost of the algorithm. Let $D(a)$ be the maximum number of pairwise disjoint intervals in $\mathcal{I}(a)$.

Lemma 9. $D(a) \ge |M(a)|/2$ for all arcs $a$.

Proof. Let $i_1 < \ldots < i_{|M(a)|}$ be the elements of $M(a)$. For each $1 \le l \le |M(a)|$, there exists an interval
$\Gamma(j_l) \in \mathcal{I}(a)$ such that $M(a) \cap \Gamma(j_l) = \{i_l\}$, because otherwise $M(a) \setminus \{i_l\}$ would still be a hitting set,
contradicting the minimality of $M(a)$. We observe that the intervals $\Gamma(j_l)$ and $\Gamma(j_{l+2})$ are disjoint since
$\Gamma(j_l)$ contains $i_l$ and $\Gamma(j_{l+2})$ contains $i_{l+2}$ but neither contains $i_{l+1}$. Therefore, the set of $\lceil |M(a)|/2 \rceil$
intervals $\{\Gamma(j_l) : 1 \le l \le |M(a)| \text{ and } l \text{ odd}\}$ is disjoint. □

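Lemma 10. $d(\mathcal{E}) \le 2d(y)$.

Proof. Fix an arc $a$. From Lemmas 6 and 9 we get

$$ \sum_i y(a,i) \ge D(a) \ge |M(a)|/2. $$

Since $d(\mathcal{E}) = d(\mathcal{A})$, we have

$$ d(\mathcal{E}) = \sum_a |M(a)| \cdot d_a \le 2 \sum_a \sum_i y(a,i)\, d_a = 2d(y). $$

□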



Together with Lemmas 4 and 5 we have that $(\mathcal{B}, \mathcal{E})$ is a 2-feasible request cover of length at most $4d(y^*) \le 4d(\mathcal{E}^*)$.
Lemmas 2 and 3 imply that Batch-Server($\mathcal{B}, \mathcal{E}$) travels at most $9\,\mathrm{OPT}$ and uses a buffer of
capacity $4k + 1$. This gives us the following theorem.

Theorem 11. There exists an offline $(9, 4 + 1/k)$-bicriteria approximation for RBP when the underlying
metric is a weighted tree.
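
The constants in this theorem can be traced by chaining the lemmas of Sections 3 and 4. The display below only restates that chain; here $y$ denotes the fractional optimum of (LP2), $y^*$ the fractional optimum of (LP1), and $\mathcal{E}$ the edge sets returned by the rounding algorithm.

```latex
\begin{align*}
d(\mathcal{E})
  &\le 2\,d(y)                 && \text{(Lemma 10)} \\
  &\le 4\,d(y^*)               && \text{(Lemma 5)} \\
  &\le 4\,d(\mathcal{E}^*)     && \text{(LP1 relaxes RCP)} \\
  &\le 4\,d(\mathrm{OPT})      && \text{(Lemma 2)} \\[4pt]
d(\text{Batch-Server})
  &\le d(\mathrm{OPT}) + 2\,d(\mathcal{E})
   \le 9\,d(\mathrm{OPT})      && \text{(Lemma 3)} \\[4pt]
\text{buffer size}
  &\le (2 + \beta)k + 1 = 4k + 1 = \Bigl(4 + \tfrac{1}{k}\Bigr)k
                               && \text{(Lemmas 3 and 4, } \beta = 2\text{)}
\end{align*}
```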

Using tree embeddings of [9], we get

Theorem 12. There exists an offline $(O(\log n), 4 + 1/k)$-bicriteria approximation for RBP over general metrics.
A Gap Between Bicriteria and True Approximations

In this section we prove that there exists an instance on the evenly-spaced line metric in which the optimal
offline solution with a buffer of size $k/4$ has to travel $\Omega(k)$ times the distance of the optimal offline solution
with a buffer of size $k$.

We consider a line graph $L$ with $2^k$ vertices $p_1 < \dots < p_{2^k}$ and unit-length edges. The input is a sequence
of requests described by a binary tree $S$ of depth $k$. Let $r$ be the root of $S$. We denote the subtree rooted at
a vertex $v$ by $S(v)$. Let $\ell_i$ be the $i$-th leaf according to the preordering of the tree. We define the destination
label of vertex $v$ to be $t(v) = \max\{i : \ell_i \in S(v)\}$ and the origin label of $v$ to be $s(v) = \min\{i : \ell_i \in S(v)\}$.
That is, $t(v)$ and $s(v)$ are the highest and lowest indices of any leaf in the subtree rooted at $v$, respectively.
The input sequence is constructed as follows. First we obtain the sequence of vertices according to the
preordering of the tree. Then we replace each non-leaf vertex $v$ in the sequence with a request lying at $p_{t(v)}$
on the line, and each leaf vertex $\ell_i$ with a block (which we refer to as a leaf block) of $k$ requests lying at $p_i$
on the line. For non-leaf vertices, we overload notation and use $v$ to refer both to the vertex in the binary
tree and the corresponding request.
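
The construction above can be sketched in a few lines of Python. This is only an illustration: it assumes that the binary tree is complete with $2^k$ leaves (which matches the $2^k$ points on the line), and the function name `build_input_sequence` is hypothetical. Requests are represented by the index of the point they lie at.

```python
def build_input_sequence(k):
    """Return the request positions (1-indexed points on the line) for parameter k.

    Assumption: the tree is a complete binary tree of depth k with 2^k leaves.
    """
    sequence = []

    def visit(height, s):
        # The subtree of this vertex covers the leaves s, ..., t.
        t = s + 2**height - 1
        if height == 0:
            # Leaf l_s: a leaf block of k requests lying at p_s.
            sequence.extend([s] * k)
            return
        # Non-leaf vertex v: one request lying at p_{t(v)}, emitted in preorder.
        sequence.append(t)
        half = 2 ** (height - 1)
        visit(height - 1, s)         # left child
        visit(height - 1, s + half)  # right child

    visit(k, 1)
    return sequence


print(build_input_sequence(2))  # [4, 2, 1, 1, 2, 2, 4, 3, 3, 4, 4]
```

For $k = 2$ this yields the sequence 4, 2, 1, 1, 2, 2, 4, 3, 3, 4, 4 given in Example 1.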

Let $\mathrm{OPT}(k)$ and $\mathrm{OPT}(k/4)$ be the optimal offline solutions to the above input sequence that use buffers
of capacity $k$ and $k/4$, respectively.

Theorem 13. We have $\mathrm{OPT}(k/4) \ge \Omega(k)\,\mathrm{OPT}(k)$.

Figure 2: Example for $k = 2$

Example 1. For $k = 2$, the line metric is represented by the integers 1, 2, 3, 4 and the input sequence is

4, 2, 1, 1, 2, 2, 4, 3, 3, 4, 4.

We present a server $k$-Server that uses a buffer of size at most $k$ and travels a distance of $2^k - 1$ on
the above input sequence.

Algorithm 3 $k$-Server

1: for $i = 1$ to $2^k$ do
2:     Move to $p_i$
3:     Serve all requests $v$ from the input that have $t(v) = i$
4: end for

Lemma 14. On the above input sequence, $k$-Server uses a buffer of size at most $k$ and travels a distance
of $2^k - 1$.

Proof. Since $k$-Server visits each vertex of the line graph exactly once, it travels a distance of $2^k - 1$.

At the beginning of the $i$-th iteration, $k$-Server has just finished reading the $(i-1)$-th leaf block and
is at $p_i$. Furthermore, it has also served all requests that reside at $p_1, \dots, p_{i-1}$. Thus, it needs to maintain
in its buffer only the requests $v$ up till the $i$-th leaf block that have $t(v) \ge i$. Since the input sequence is
constructed using the preordering of $S$, these requests correspond to the ancestors of leaf $\ell_i$ in $S$. The tree $S$
is of depth $k$ so it needs to maintain at most $k - 1$ requests in its buffer at all times, in addition to a space
of 1 that is needed to read requests from the input. □

Next, we show that $\mathrm{OPT}(k/4) \ge \frac{k}{4}\,\mathrm{OPT}(k)$. Let Server be an optimal server with a buffer of size $k/4$
for the above input sequence. We analyze the movement of Server in phases. We define the $i$-th phase to
be the duration starting from the time the last request of the $(i-1)$-th leaf block is read to the time the last
request of the $i$-th leaf block is read.

Lemma 15. At the end of the $i$-th phase, Server is at $p_i$.

Proof. Since the requests of the $i$-th leaf block all lie at $p_i$ on the line metric, we assume w.l.o.g. that either
the entire block is served together or buffered together. However, the block has length $k$, which exceeds the
buffer capacity $k/4$, thus Server must serve the entire block. □

For request $v$, let $d(v)$ be the distance travelled between $p_{t(v)}$ and the previously served request, and
$D(v) = \sum_{u \in S(v)} d(u)$. Then, the total cost to serve non-leaf vertices is $D(r) = \sum_v d(v)$. We define $C_i$ to be
the contents of Server's buffer at the end of the $i$-th phase, i.e. when it reads the last request of the $i$-th
block. Let $C_i(v) = C_i \cap S(v)$ and $C(v) = \sum_{i \in [s(v),t(v)]} |C_i(v)|$.
Let $h(v)$ denote the height of $v$.

Lemma 16. We have $C(v) + D(v) \ge h(v)\,2^{h(v)-1}$ for all vertices $v$.

Proof. We use a proof by induction on the height of $v$. For the base case, $v$ is a leaf. The base case follows
from the fact that a leaf has height 0 and both $C(v)$ and $D(v)$ are non-negative. We consider the inductive
case next. Let $v_1$ and $v_2$ be the left and right children of $v$, respectively. Request $v$ is read in the $s(v)$-th
phase. Suppose $v$ is served in the $i'$-th phase. Since $C_i(v) = C_i(v_1) \cup C_i(v_2) \cup \{v\}$ if $v \in C_i(v)$ and
$C_i(v) = C_i(v_1) \cup C_i(v_2)$ if $v \notin C_i(v)$, we get that

$$C(v) = \sum_{i \in [s(v),t(v)]} \bigl(|C_i(v_1)| + |C_i(v_2)|\bigr) + |[s(v), i'-1]|.$$

We observe that $[s(v),t(v)] = [s(v_1),t(v_1)] \cup [s(v_2),t(v_2)]$, $s(v_1) = s(v)$ and $t(v_2) = t(v)$. Suppose that
$i' \in [s(v_1), t(v_1)]$ and the request served just before $v$ is $v'$. Lemma 15 implies that the server is at $p_{i'-1}$ at the
beginning of the $i'$-th phase, so w.l.o.g. $t(v') < t(v)$. The input sequence is obtained using the preordering
of $S$, so the server has not read any request $u$ with $t(u) \in [s(v_2),t(v_2))$. Hence, we have $t(v') < s(v_2)$ so the
server must have traversed at least $p_{s(v_2)}, p_{s(v_2)+1}, \dots, p_{t(v_2)}$ to serve $v$. So, we have that $d(v) \ge 2^{h(v)-1}$.

On the other hand, if $i' \in [s(v_2),t(v_2)]$ then $|[s(v),i'-1]| \ge |[s(v_1),t(v_1)]| = 2^{h(v)-1}$. Thus, either
$|[s(v),i'-1]|$ or $d(v)$ is at least $2^{h(v)-1}$. Since $D(v) = D(v_1) + D(v_2) + d(v)$, we get

$$\begin{aligned}
C(v) + D(v) &= C(v_1) + C(v_2) + |[s(v), i'-1]| + D(v_1) + D(v_2) + d(v) \\
&\ge C(v_1) + C(v_2) + D(v_1) + D(v_2) + 2^{h(v)-1} \\
&\ge h(v)\,2^{h(v)-1},
\end{aligned}$$

where the second inequality follows from applying the inductive hypothesis on both $v_1$ and $v_2$. □
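
The second inequality compresses a short calculation, which can be written out as follows. This is a sketch under the assumption (consistent with the construction, but not stated explicitly in the proof) that both children of $v$ have height $h - 1$, where $h = h(v)$.

```latex
\begin{align*}
C(v_1) + D(v_1) + C(v_2) + D(v_2) + 2^{h-1}
  &\ge 2\,(h-1)\,2^{h-2} + 2^{h-1} \\
  &= (h-1)\,2^{h-1} + 2^{h-1} \\
  &= h\,2^{h-1}.
\end{align*}
```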

Server cannot buffer more than $k/4$ requests at any point in time, therefore $|C_i| \le k/4$ for all $i$, and hence
$C(r) \le \frac{k}{4} 2^k$. Applying Lemma 16 to the root $r$, which has height $k$, implies that
$D(r) \ge \frac{k}{2} 2^k - \frac{k}{4} 2^k = \frac{k}{4} 2^k$. Since $\mathrm{OPT}(k/4) \ge D(r)$ and $\mathrm{OPT}(k) \le 2^k - 1$ by Lemma 14, this
completes the proof of Theorem 13.
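
The arithmetic behind Theorem 13 can be checked numerically. The sketch below is illustrative only and its function names are hypothetical. It evaluates the bound of Lemma 16 at the root, subtracts the largest possible value of $C(r)$ for a buffer of size $k/4$, and compares the result with the distance $2^k - 1$ travelled by $k$-Server. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def lemma16_bound(h):
    # Right-hand side of Lemma 16 for a vertex of height h: h * 2^(h-1).
    return Fraction(h * 2**h, 2)


def lower_bound_small_buffer(k):
    # D(r) >= h(r) 2^(h(r)-1) - C(r), with h(r) = k and C(r) <= (k/4) 2^k,
    # since at most k/4 requests are buffered in each of the 2^k phases.
    return lemma16_bound(k) - Fraction(k, 4) * 2**k


def upper_bound_large_buffer(k):
    # Travel distance of k-Server from Lemma 14.
    return 2**k - 1


for k in (4, 8, 16):
    # Inductive step of Lemma 16: two children of height k-1 plus 2^(k-1).
    assert lemma16_bound(k) == 2 * lemma16_bound(k - 1) + 2 ** (k - 1)
    ratio = lower_bound_small_buffer(k) / upper_bound_large_buffer(k)
    assert ratio >= Fraction(k, 4)
    print(k, float(ratio))
```

In each case the ratio between the two bounds is at least $k/4$, which is the $\Omega(k)$ gap claimed in Theorem 13.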